A Joint Model of Rhetorical Discourse Structure and Summarization
Abstract
In Rhetorical Structure Theory, discourse units participate in asymmetric relationships, with one element acting as the nucleus and the other as the satellite. In the resulting tree-like nuclearity structure, the importance of each discourse unit can be measured by the number of relations in which it acts as the nucleus or as the satellite. Existing approaches to automatically parsing such structures suffer from two problems: they employ local inference techniques that do not capture document-level structural regularities, and they rely on annotated training data, which is expensive to obtain at the discourse level. We investigate the SampleRank structure learning algorithm as a potential solution to both problems. SampleRank allows us to incorporate arbitrary document-level features in a global stochastic inference algorithm. Furthermore, it enables the training of a joint model of discourse structure and summarization, which can be learned from document-level summaries alone, without discourse-level supervision. We obtain mixed results in the fully supervised case, and negative results for the joint model of discourse structure and summarization.
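The nuclearity-based importance measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary tree in which every relation has exactly one nucleus and one satellite, and it ranks units by nucleus count minus satellite count. The names `Relation` and `nuclearity_counts`, and the scoring rule, are hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass
class Relation:
    """One asymmetric discourse relation (hypothetical representation)."""
    nucleus: "Node"
    satellite: "Node"


# A leaf is an elementary discourse unit, identified here by a string label.
Node = Union[str, Relation]


def nuclearity_counts(
    tree: Node,
    n: int = 0,
    s: int = 0,
    out: Optional[Dict[str, Tuple[int, int]]] = None,
) -> Dict[str, Tuple[int, int]]:
    """Map each unit to (relations on its nucleus side, relations on its satellite side)."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = (n, s)
        return out
    # Every unit under the nucleus child gains one nucleus count,
    # every unit under the satellite child gains one satellite count.
    nuclearity_counts(tree.nucleus, n + 1, s, out)
    nuclearity_counts(tree.satellite, n, s + 1, out)
    return out


# Toy document: (e1 <- e2) forms a span that is itself the nucleus of e3.
tree = Relation(nucleus=Relation(nucleus="e1", satellite="e2"), satellite="e3")
counts = nuclearity_counts(tree)

# Assumed scoring rule: nucleus count minus satellite count, highest first.
ranking = sorted(counts, key=lambda u: counts[u][0] - counts[u][1], reverse=True)
```

In this toy tree, `e1` is on the nucleus side of both relations and therefore ranks first, which is the kind of signal a summarizer could use to select units.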
Similar resources
Joint semantic discourse models for automatic multi-document summarization
Automatic multi-document summarization aims at selecting the essential content of related documents and presenting it in a summary. In this paper, we propose some methods for automatic summarization based on Rhetorical Structure Theory and Cross-document Structure Theory. They are chosen in order to properly address the relevance of information, multidocument phenomena and subtopical distributi...
Thai Rhetorical Structure Analysis
Rhetorical structure analysis (RSA) explores discourse relations among elementary discourse units (EDUs) in a text. It is very useful in many text processing tasks employing relationships among EDUs such as text understanding, summarization, and question-answering. Thai language with its distinctive linguistic characteristics requires a unique technique. This article proposes an approach for Th...
Mining Discourse Markers For Chinese Textual Summarization
Discourse markers foreshadow the message thrust of texts and saliently guide their rhetorical structure which are important for content filtering and text abstraction. This paper reports on efforts to automatically identify and classify discourse markers in Chinese texts using heuristic-based and corpus-based data-mining methods, as an integral part of automatic text summarization via rhetorica...
Hierarchical Discourse Parsing Based on Similarity Metrics
Attentional State Theory and Rhetorical Structure Theory are two predominant theories of discourse parsing. Combining these two approaches, in this paper, we describe a novel approach for discourse parsing. The resulting discourse tree structure retains the following properties: structure of purpose from Attentional State Theory and relations between sentences from Rhetorical Structure Theory. We d...
Modeling Discourse Structure and Temporal Event Relations for Automated Document Summarization with Markov Logic Networks
HA, EUN YOUNG. Modeling Discourse Structure and Temporal Event Relations for Automated Document Summarization with Markov Logic Networks. (Under the direction of James C. Lester.) Recent years have seen significant progress in natural language processing. A key challenge posed by many natural language applications ranging from text summarization to question answering and machine translation is ...